
Three do files create the ancillary files used to partition our main datasets and test for heterogeneous agglomeration.
These three files are:

- "Entry_Exit_Employment_RESTAT.do" (Stata do file)
- "Sector_Birth_Age_RESTAT.do" (Stata do file)
- "High_Low_Skills_LFS_RESTAT.do" (Stata do file)

The first two do files use data from the Business Structure Database (BSD), which is accessible to authorised users via the UK Data Service Secure Lab.

The first do file (Entry_Exit_Employment_RESTAT) estimates firm entry and exit rates, as well as the size of existing firms and entrants.
This information is then merged into our main dataset on coagglomeration to characterise sector pairs as dynamic/steady/mixed or small/large/mixed, according to the characteristics of the firms belonging to the sector pair.
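
As a rough illustration of the entry/exit step, the sketch below computes entry rates, exit rates and average firm size by sector and year on toy data. It is not the code in the do file: the variable names (firm, sector, year, entrant, exiter, emp) and the definition of a rate as the share of active firms flagged as entrants or exiters are assumptions made for this example only.

```stata
* Toy firm-year data; all variable names are hypothetical.
clear
input firm sector year entrant exiter emp
1 10 2000 0 0 50
2 10 2000 1 0 4
3 10 2000 0 1 20
4 10 2000 1 0 6
5 20 2000 0 0 100
6 20 2000 0 0 80
7 20 2000 0 0 60
8 20 2000 1 0 8
end

* Employment of entrants and of existing firms, kept in separate variables
* so that each mean ignores the other group (missing values are skipped).
gen emp_entrant   = emp if entrant == 1
gen emp_incumbent = emp if entrant == 0

* Assumed definition: rate = share of active firms that enter or exit.
collapse (mean) entry_rate = entrant exit_rate = exiter ///
    emp_entrant emp_incumbent, by(sector year)

list, clean
```

The result has one row per sector and year, which is the shape needed before the rates are attached to sector pairs.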

The second do file (Sector_Birth_Age_RESTAT) estimates plant age in different sectors.
Age is measured in several ways, for example as the median age, the age of the oldest plant and the age of the youngest plant in a given sector, each averaged over the years under consideration.
This information is used to characterise the pairs in the main data on coagglomeration as young/old/mixed.
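
The age measures described above could be computed along the following lines. This is a minimal sketch on toy data, assuming a plant-year panel with hypothetical variables plant, sector, year and birth_year; the actual do file may construct age differently.

```stata
* Toy plant-year data; all variable names are hypothetical.
clear
input plant sector year birth_year
1 10 2000 1990
2 10 2000 1995
3 10 2000 1998
1 10 2001 1990
2 10 2001 1995
3 10 2001 1998
4 20 2000 1980
5 20 2000 1999
end

gen age = year - birth_year

* Step 1: median, oldest and youngest plant age within each sector-year.
collapse (median) med_age = age (max) max_age = age (min) min_age = age, ///
    by(sector year)

* Step 2: average each measure over the years, leaving one row per sector.
collapse (mean) med_age max_age min_age, by(sector)

list, clean
```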

The last do file (High_Low_Skills_LFS_RESTAT) uses information contained in the Labour Force Surveys (LFS).
The LFS can be accessed by authorised users via the UK Data Service Secure Lab. The LFS data stored on the Secure Lab include detailed geographical identifiers (wards).
These allow individuals to be mapped to Travel to Work Areas. This level of detail cannot be obtained outside the Secure Lab.
Using LFS data, the do file characterises sectors by the share of workers who have high/low qualifications.
This information is then merged into the main data on coagglomeration to characterise industries in the pair as having high/low/mixed education.
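
A sector-level skill measure of this kind might be built as sketched below. The example is illustrative only: the worker-level indicator degree, the omission of survey weights, and the use of the cross-sector median as the cutoff between high- and low-qualification sectors are all assumptions, not choices documented for the do file.

```stata
* Toy worker-level data; all variable names are hypothetical.
* degree = 1 if the worker holds a high qualification, 0 otherwise.
clear
input person sector degree
1 10 1
2 10 1
3 10 0
4 10 1
5 20 0
6 20 0
7 20 1
8 20 0
end

* Share of highly qualified workers in each sector (unweighted here).
collapse (mean) share_high = degree, by(sector)

* Assumed cutoff: a sector is "high" if its share exceeds the median share.
egen cutoff = median(share_high)
gen byte high = share_high > cutoff

list, clean
```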

The main data files produced by these three do files are:

- BSD_entry_exit_emp.dta
- birth_date_i.dta and birth_date_j.dta for sectors i and j in the industry pair
- high_di.dta and high_dj.dta for sectors i and j in the industry pair
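
For orientation, the sketch below shows one way sector-level files like those listed above could be attached to pair-level data, once for sector i and once for sector j, and then used to label each pair. It uses a temporary file and toy data rather than the files named above, whose variable names are not documented here; sector, sector_i, sector_j, high and pair_type are hypothetical names, and the rule "both high / both low / otherwise mixed" is an assumption.

```stata
* Toy sector-level file with a high-skill flag (hypothetical names).
clear
input sector high
10 1
20 0
30 1
end
tempfile sec
save `sec'

* Toy pair-level data: one row per industry pair (i, j).
clear
input sector_i sector_j
10 20
10 30
20 20
end

* Attach the flag for sector i.
rename sector_i sector
merge m:1 sector using `sec', keep(master match) nogenerate
rename (sector high) (sector_i high_i)

* Attach the flag for sector j.
rename sector_j sector
merge m:1 sector using `sec', keep(master match) nogenerate
rename (sector high) (sector_j high_j)

* Assumed rule: both high, both low, otherwise mixed.
gen str5 pair_type = "mixed"
replace pair_type = "high" if high_i == 1 & high_j == 1
replace pair_type = "low"  if high_i == 0 & high_j == 0

list, clean
```

Keeping separate i and j versions of each sector-level file, as in the list above, avoids the rename step but requires the same information to be stored twice.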